ABSTRACT "Candidatus Synechococcus spongiarum" is a cyanobacterial symbiont widely distributed in sponges, but its functions
at the genome level remain unknown. Here, we obtained the draft genome (1.66 Mbp, 90% estimated genome recovery) of
"Ca. Synechococcus spongiarum" strain SH4 inhabiting the Red Sea sponge Carteriospongia foliascens. Phylogenomic analysis 
revealed a high dissimilarity between SH4 and free-living cyanobacterial strains. Essential functions, such as photosynthesis, the 
citric acid cycle, and DNA replication, were detected in SH4. Eukaryoticlike domains that play important roles in sponge-symbiont
interactions were identified exclusively in the symbiont. However, SH4 could not biosynthesize methionine and polyamines
and had lost some of the genes encoding low-molecular-weight peptides of the photosynthesis complex, antioxidant enzymes,
DNA repair enzymes, and proteins involved in resistance to environmental toxins and in biosynthesis of capsular and extracellu- 
lar polysaccharides. These genetic modifications imply that "Ca. Synechococcus spongiarum" SH4 represents a low-light- 
adapted cyanobacterial symbiont and has undergone genome streamlining to adapt to the sponge's mild intercellular environ- 
ment. 

IMPORTANCE Although the diversity of sponge-associated microbes has been widely studied, genome-level research on sponge 
symbionts and their symbiotic mechanisms is rare because they are unculturable. "Candidatus Synechococcus spongiarum" is a 
widely distributed uncultivated cyanobacterial sponge symbiont. The genome of this symbiont will help to characterize its evo- 
lutionary relationship and functional dissimilarity to closely related free-living cyanobacterial strains. Knowledge of its adaptive 
mechanism to the sponge host also depends on genome-level research. The approach presented here offers an alternative strategy
for obtaining the draft genome of "Ca. Synechococcus spongiarum" strain SH4 and provides insight into its evolutionary and
functional features.



Received 22 January 2014 Accepted 6 March 2014 Published 1 April 2014 

Citation Gao Z-M, Wang Y, Tian R-M, Wong YH, Batang ZB, Al-Suwailem AM, Bajic VB, Qian P-Y. 2014. Symbiotic adaptation drives genome streamlining of the cyanobacterial 
sponge symbiont "Candidatus Synechococcus spongiarum." mBio 5(2):e00079-14. doi:10.1128/mBio.00079-14.
Editor Maria Dominguez Bello, New York University School of Medicine

Copyright © 2014 Gao et al. This is an open-access article distributed under the terms of the Creative Commons Attribution-Noncommercial-ShareAlike 3.0 Unported license,
which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original author and source are credited. 
Address correspondence to Pei-Yuan Qian, boqianpy@ust.hk.



As one of the oldest, most primitive metazoans, sponges are
distributed globally and play important ecological roles (1-3).
The association of symbiotic microbes with the sponge was
identified several decades ago (4). Since then, sponge-associated 
microbial communities and their diversity have been studied ex- 
tensively (5). Pyrosequencing techniques further facilitated the 
investigation of sponge-associated microbes (6, 7). A recent study 
reported up to 32 bacterial phyla and candidate phyla in sponges 
(7). To some extent, sponge-associated microbial communities 
showed sponge-species specificity and tropical-subtropical dis- 
similarity (6, 7). In addition, dissimilarity in the composition of 
microbial communities between sponges with high and low mi- 
crobial abundance has been identified (8). 



Diverse symbiotic microbes in sponges function in nitrogen 
fixation, nitrification, photosynthesis, and sulfate reduction and 
affect the health, ecological distribution, and evolutionary pro- 
cesses of the host (5, 9). However, most sponge symbionts are 
unculturable and some fall into sponge-specific clusters (10), 
which makes it difficult to understand their functions and symbi- 
otic mechanisms at the genome level. To date, researchers have 
obtained only one complete genome of a sponge symbiotic microbe,
that of the psychrophilic crenarchaeon Cenarchaeum
symbiosum (11). On the other hand, a draft genome of an uncul- 
tured deltaproteobacterium in association with the sponge Cym- 
bastela concentrica was extracted from metagenomic data using 
the tetranucleotide frequency method (12). The single-cell 
FIG 1 Neighbor-joining phylogenetic tree of strain SH4 and closely related cyanobacterial strains based on partial 16S rRNA genes (1,100 bp). CSS, cluster of
"Ca. Synechococcus spongiarum." Bootstrap values based on 1,000 replications are shown as percentages at branch angles. Scale bar indicates 1% estimated 
sequence divergence. 



method has also been used to study the genomes of poribacterial 
symbionts in marine sponges (13). However, our knowledge of 
sponge symbionts at the genome level remains limited. 

Cyanobacteria represent one of the most common members of 
the sponge-associated microbial communities and are considered 
to play important roles in photosynthesis, nitrogen fixation, UV 
protection, and defensive toxin production (5, 14). Identified cyanobacterial
sponge symbionts belong to Synechocystis, Aphanocapsa,
Anabaena, Oscillatoria, and "Candidatus Synechococcus
spongiarum" (5). "Ca. Synechococcus spongiarum," proposed by 
Usher et al. (15), was found in at least 40 sponge species and 
represents the largest sponge-specific cluster to date (10). Electron 
and fluorescence micrographs of "Ca. Synechococcus spongia- 
rum" symbionts revealed their presence in the intercellular envi- 
ronment of host sponges and provided evidence for their interac- 
tion with sponge amebocytes (16). In addition, vertical 
transmission of "Ca. Synechococcus spongiarum" from parents to 
offspring has been reported (17). Although the genetic differentiation
of "Ca. Synechococcus spongiarum" is considered to be very
low among populations from different host species or geographi- 
cal regions according to the similarity of the 16S rRNA genes, their 
internal transcribed spacer (ITS) region displays high variations 
(18). The functional properties of this highly prevalent sponge 
symbiont and its symbiotic interaction with sponges remain un- 
clear. Marine picocyanobacteria of the genera Prochlorococcus and 
Synechococcus overwhelmingly dominate the picophytoplankton 
of the world ocean and contribute vitally to global primary production
(19, 20). The features that distinguish "Ca. Synechococcus
spongiarum" from free-living picocyanobacteria and the
mechanism underlying its ability to adapt to the symbiotic part- 
nership are still unclear. Genomic analyses may provide answers 
to these questions. 

According to our previous data, "Ca. Synechococcus spongia- 
rum" is highly abundant in the sponge Carteriospongia foliascens, 
collected from the Red Sea, and represents the dominant cyanobacterial
symbiont (see Fig. S1 in the supplemental material), thus
permitting the extraction of its genome from the microbial com- 
munity. The development of bioinformatics also makes it feasible 
to obtain genomic sequences of uncultured bacteria from multiple 
metagenomes by genome binning based on differential coverage 
and tetranucleotide frequency (12, 21). Here, we report a draft 
genome of "Ca. Synechococcus spongiarum" strain SH4 extracted 
from metagenomic data. Using the extracted genome, we exam- 
ined the evolutionary relationship and functional dissimilarity of 
"Ca. Synechococcus spongiarum" SH4 with closely related free- 
living cyanobacterial strains and so pave the way to understand its 
adaptive mechanism to the sponge-symbiont partnership. 

RESULTS AND DISCUSSION 

Genome binning. Metagenomic DNA of the microbial commu- 
nity in the Red Sea sponge C. foliascens was subjected to 454 py- 
rosequencing, and metagenomic reads were assembled using GS 
De Novo Assembler (Newbler). A full-length cyanobacterial 16S 
rRNA gene was predicted from the assembled metagenomic con- 
tigs and completely matched with the dominant cyanobacteria 
(represented by operational taxonomic unit 1691 [OTU1691]; see 
TABLE 1 General features and functional comparison of "Candidatus Synechococcus spongiarum" SH4 and related picocyanobacterial strains

Genome feature or gene function; value for indicated taxon(a): 1, 2, 3, 4, 5, 6, 7, 8

(a) Taxa: 1, "Ca. Synechococcus spongiarum" SH4; 2, Synechococcus sp. strain RCC307; 3, Synechococcus sp. strain RS9917; 4, Synechococcus sp. strain WH 5701; 5, Synechococcus sp.
strain CC9311; 6, P. marinus CCMP1375; 7, Cyanobium sp. strain PCC 7001; 8, Cyanobium gracile PCC 6307. Genome recovery: F, finished genome; D, draft genome.



Fig. S1 in the supplemental material) in the sponge C. foliascens. A
phylogenetic tree based on the predicted 16S rRNA gene indicated 
that this symbiont is distantly related to free-living Synechococcus
and Prochlorococcus species and that it groups together with se- 
quences from "Ca. Synechococcus spongiarum" derived from 
various sponge species (15, 18) and shares more than 99% identity
with them (Fig. 1). This cyanobacterial symbiont in the sponge 
C. foliascens was designated "Candidatus Synechococcus spongia- 
rum" strain SH4. Analysis of 16S rRNA genes in 454 metagenomic 
reads revealed a consistently high abundance of the symbiont "Ca. 
Synechococcus spongiarum" SH4 (see Fig. S2 in the supplemental 
material). Among the 16S rRNA reads, 67% (158/236) were as- 
signed to Cyanobacteria, of which 155 reads matched the 16S 
rRNA gene of SH4 with more than 99% identity. These reads were 
thus sorted to "Ca. Synechococcus spongiarum" SH4. The high 
abundance of SH4 in the metagenome data implied that the ge- 
nome coverage of SH4 in the assembled contigs (average coverage, 
~28×) was much higher than those of other sponge-associated
microbes, which facilitated distinguishing contigs of SH4 from the 
others. 
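
The read-level abundance figures above follow from simple proportions. The short Python sketch below only re-derives the percentages from the three counts reported in the text; the function and variable names are illustrative assumptions and are not part of the original analysis.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Counts reported in the text for the 454 metagenomic 16S rRNA reads.
total_16s_reads = 236        # all 16S rRNA reads
cyanobacterial_reads = 158   # reads assigned to Cyanobacteria
sh4_reads = 155              # reads matching the SH4 16S rRNA gene (>99% identity)

cyano_share = percent(cyanobacterial_reads, total_16s_reads)
sh4_share_of_cyano = percent(sh4_reads, cyanobacterial_reads)
sh4_share_of_all = percent(sh4_reads, total_16s_reads)

print(f"Cyanobacteria among 16S reads: {cyano_share:.1f}%")
print(f"SH4 among cyanobacterial reads: {sh4_share_of_cyano:.1f}%")
print(f"SH4 among all 16S reads: {sh4_share_of_all:.1f}%")
```

Under these counts, nearly all cyanobacterial 16S reads derive from SH4, which is what makes coverage a useful signal for separating its contigs from those of other community members.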

Using a combination of genome coverage and tetranucleotide 
frequency patterns and taking the GC content and essential genes 
into account (21), we extracted the genome of "Ca. Synechococ- 
cus spongiarum" SH4 from the assembled metagenomic contigs 
(Table 1; see Fig. S3 in the supplemental material). The draft ge- 
nome of SH4 was 1.6 Mbp in length, with a GC content of 63.4%. 
A total of 96 of 106 single-copy essential genes were identified
in the draft genome, indicating an estimated genome recovery of about 90%.
Although the genome is not complete, the
draft genome is sufficient to permit analysis of the functional
properties of the sponge symbiont "Ca. Synechococcus spongia- 
rum." To validate the occurrence of genome reduction, the genes 



of interest that were lacking in the SH4 draft genome were checked 
again in the remaining assembled metagenomic contigs with 
lengths longer than 500 bp and 454 metagenomic coverage higher 
than 8×.
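
As a rough illustration of the two numerical criteria used above, the following Python sketch computes marker-based genome recovery and applies the length and coverage thresholds used when rechecking the remaining contigs. The `Contig` class, the function names, and the example contigs are hypothetical; only the marker counts (96 of 106) and the thresholds (500 bp, 8×) come from the text.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    """Minimal contig record (hypothetical structure for illustration)."""
    name: str
    length_bp: int
    coverage: float


def estimated_recovery(found_markers: int, total_markers: int) -> float:
    """Estimate genome recovery (%) from single-copy essential genes found."""
    if total_markers <= 0:
        raise ValueError("total_markers must be positive")
    return 100.0 * found_markers / total_markers


def contigs_to_recheck(contigs, min_length_bp=500, min_coverage=8.0):
    """Keep contigs longer than min_length_bp with coverage above min_coverage."""
    return [c for c in contigs
            if c.length_bp > min_length_bp and c.coverage > min_coverage]


# Marker counts reported in the text.
recovery = estimated_recovery(96, 106)
print(f"Estimated genome recovery: {recovery:.1f}%")

# Invented example contigs, for illustration only.
example = [
    Contig("contig_a", 1200, 27.5),  # passes both thresholds
    Contig("contig_b", 450, 30.0),   # too short
    Contig("contig_c", 900, 6.0),    # coverage too low
]
kept = [c.name for c in contigs_to_recheck(example)]
print(kept)
```

Searching for the missing genes in this wider contig set guards against reporting a gene as lost when it was merely left out of the bin.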

Phylogenomic inference. The explosive growth of genomic 
data has provided conserved marker genes as alternatives to 16S 
rRNA genes for phylogenetic inference (22). Here, a phylog- 
enomic tree based on 31 concatenated marker genes (Fig. 2a) re- 
vealed the evolutionary distinction of SH4 from picocyanobacte- 
ria of the genera Synechococcus, Prochlorococcus, and Cyanobium 
cluster Synechococcus and supported its evolutionary divergence 
from free-living cyanobacteria (18). The bipartition point where 
SH4 branched from other picocyanobacteria suggested that SH4 
was an independent cyanobacterial lineage that had adapted to the 
symbiotic lifestyle for a long period of time. This is in accord with 
previous findings demonstrating that these symbionts are verti- 
cally inherited from the parent sponges (17). Average nucleotide 
identity (ANI) and tetranucleotide frequency are powerful tools 
for comparative analysis of genome composition (23). SH4
showed higher ANIs and tetranucleotide frequency similarity to
Synechococcus and Cyanobium cluster Synechococcus than to Pro- 
chlorococcus (Fig. 2b). In addition, SH4 represented one of the 
cyanobacterial strains with the highest GC contents (>60%) and 
was more similar to Synechococcus and Cyanobium cluster Syn- 
echococcus than to Prochlorococcus (Fig. 2c). These results indi- 
cated that SH4 was more closely related to free-living Synechococ- 
cus than to Prochlorococcus. Accordingly, closely related strains 
affiliated with Synechococcus and Cyanobium cluster Synechococ- 
cus were selected for the genome-level functional comparison with 
SH4 (Table 1). The low-light-adapted cyanobacterium Prochlorococcus
marinus strain SS120 (CCMP1375 in Table 1), with a nearly
FIG 2 Phylogenomic inference of "Ca. Synechococcus spongiarum" SH4. (a) Maximum-likelihood phylogenomic tree using 31 concatenated conserved
proteins of SH4 and cyanobacterial strains. The branches were colored according to the bootstrap values, ranging from purple (80%) to red (100%). Scale bar
indicates 10% estimated sequence divergence. (b) ANI values plus correlation indexes of tetranucleotide frequency between SH4 and closely related Synechococcus,
Prochlorococcus, Cyanobium cluster Synechococcus, and other cyanobacterial strains revealed differences in their genome composition. (c) GC content and
genome size of SH4 and other cyanobacterial strains.



minimal oxyphototrophic genome, was also included in the com- 
parison analysis. 
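
The compositional comparison described above can be sketched in a few lines of Python. This is a minimal illustration, assuming raw single-strand tetranucleotide frequencies compared by Pearson correlation; published tools typically work on normalized signatures (for example, z-scores) and on both strands, so the values here are not those of Fig. 2b. All function names are hypothetical.

```python
from itertools import product
from math import sqrt

# All 256 possible tetranucleotides in a fixed order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def gc_content(seq: str) -> float:
    """GC content (%) over unambiguous bases only."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return 100.0 * gc / (gc + at)


def tetranucleotide_frequencies(seq: str) -> list[float]:
    """Relative frequency of each tetranucleotide (overlapping windows)."""
    seq = seq.upper()
    counts = dict.fromkeys(TETRAMERS, 0)
    for i in range(len(seq) - 3):
        window = seq[i:i + 4]
        if window in counts:  # skips windows containing N or other symbols
            counts[window] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence has no valid tetranucleotide")
    return [counts[t] / total for t in TETRAMERS]


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


# Toy sequences, for illustration only (real input would be genome contigs).
f = tetranucleotide_frequencies("ACGTACGTAC")
g = tetranucleotide_frequencies("AAAAACCCCC")
print(pearson(f, g))
```

Two genomes with similar oligonucleotide usage give a correlation close to 1, which is the sense in which SH4 is compositionally closer to Synechococcus than to Prochlorococcus.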

Functional features at the genomic level. The numbers of 
genes present in selected pathways of "Ca. Synechococcus spon- 
giarum" SH4 and free-living picocyanobacteria strains are pre- 
sented in Table 1; these data suggest the near completeness of key 
functional pathways, including photosynthesis, the citric acid cy- 
cle (tricarboxylic acid [TCA] cycle), DNA replication, and pepti- 
doglycan biosynthesis. As a sponge symbiont, SH4 also displayed 
unique symbiotic features, with the highlight being genome 
streamlining following the loss of unnecessary genes in several 
pathways (Table 1). Previous studies have revealed the distribu- 
tion and evolutionary divergence of the sponge symbiont "Ca. 
Synechococcus spongiarum" using molecular ecology techniques 



(18). Although "Ca. Synechococcus spongiarum" was thought to 
enhance host metabolism and ecological fitness by providing a 
photosynthesis-derived carbon source, its functional properties at 
the genome level remain unclear. Here, the extracted genome pro- 
vided direct insights into the carbon and energy metabolism of 
"Ca. Synechococcus spongiarum" SH4 and its symbiotic adapta- 
tion to the sponge host. 

Host-symbiont interaction. Proteins with ankyrin repeats 
(ARs), leucine-rich repeats (LRRs), and fibronectin type III do- 
mains were enriched in SH4 but rare in free-living cyanobacteria 
(Table 1). Proteins with eukaryoticlike domains, such as ARs, tet- 
ratricopeptide repeats (TPRs), LRRs, NHL repeats (PF01436), 
and fibronectin type III, have been reported to be enriched in 
sponge symbiotic microbes (12, 24) and were suggested to mod- 
FIG 3 Schematic overview of the methionine metabolism pathway in "Ca. Synechococcus spongiarum" SH4. Red and green labels indicate the absence and
presence, respectively, of that enzyme in SH4. 



ulate host behavior by interfering with eukaryotic protein-protein 
interactions. Specifically, AR proteins from sponge symbionts modulate
amoebal phagocytosis and might help symbionts escape digestion
by the sponge host (25). The enrichment of these domains
in SH4 is consistent with the role of "Ca. Synechococcus spongia- 
rum" as a sponge symbiont. Interestingly, the number of TPR 
proteins was lower in SH4 than in several free-living cyanobacte- 
rial strains (Table 1). The TPR motif was originally identified in 
yeast (26) but was recently found in a wide variety of prokaryotes 
and suggested to be involved in numerous cell processes (27). The 
hypothesis that TPR functions as a symbiotic factor in the sponge- 
symbiont interaction requires further careful evaluation. 

Amino acid metabolism. Although obligate symbiotic bacteria
tend to lose essential metabolic pathways that are required
for free-living organisms, especially those responsible for amino 
acid metabolism (28, 29), the numbers of genes in most of the 
amino acid metabolism pathways of "Ca. Synechococcus spongia- 
rum" SH4 were similar to the numbers in free-living cyanobacte- 
rial strains (see Fig. S4 in the supplemental material). Surprisingly, 
the cysteine and methionine metabolism pathway was dramati- 
cally reduced (Table 1). In-depth analysis showed that there was 
no enzyme for the de novo biosynthesis of the methionine precur- 
sor (homocysteine), although methionine synthase (metH) was 
present in the SH4 draft genome (Fig. 3). In addition to de novo 
biosynthesis, the methionine salvage pathway is an important 
metabolic pathway for maintaining the concentration of 
L-methionine in bacteria (30). Key genes in the methionine sal- 
vage pathway, including S-adenosylmethionine decarboxylase 
(speD), spermidine synthase (speE), methylthioribose-1-phosphate
isomerase (mtnA), and 1,2-dihydroxy-3-keto-5-methylthiopentene
dioxygenase (mtnD), were lost (Fig. 3). The
lack of de novo biosynthesis and salvage of methionine suggested 
that the essential methionine might be provided by exogenous 



sources. In the methionine salvage pathway, SpeD and SpeE are 
two enzymes responsible for the biosynthesis of spermidine, a 
prevalent polyamine found in bacteria (31). Previous studies have 
shown that polyamines in bacteria play important roles in optimal 
cell growth, signaling cell differentiation, DNA protection, biofilm
formation, and antibiotic resistance (31). High-performance
liquid chromatography analysis revealed that polyamines were
widely distributed in cyanobacteria, which indicates their impor- 
tant roles for the cyanobacteria (32). SH4 lives in the sponge in- 
tercellular environment (18) and likely acquires exogenous poly- 
amines therein. "Ca. Synechococcus spongiarum" has been 
shown to interact with host amebocytes (18), mobile cells responsible
for food digestion. Amebocytes might digest food and release
nutrients to satisfy the necessities, such as polyamines and methi- 
onine, for the cyanobacterial symbiont. Since other symbiotic 
bacteria were also found in the sponge host (Fig. S1), they might
be an alternative source of these essential chemicals for "Ca. Syn- 
echococcus spongiarum." 
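
The presence/absence reasoning in this section amounts to checking a list of pathway genes against the genes annotated in the draft genome. The Python sketch below illustrates that check for the methionine salvage genes named above; the helper name and the toy annotation set are hypothetical, and a real analysis would start from the full KEGG or SEED annotation rather than a hand-written set.

```python
def missing_genes(pathway_genes, annotated_genes):
    """Return pathway genes that are absent from the annotation (case-insensitive)."""
    annotated = {gene.lower() for gene in annotated_genes}
    return [gene for gene in pathway_genes if gene.lower() not in annotated]


# Key methionine salvage genes named in the text.
methionine_salvage = ["speD", "speE", "mtnA", "mtnD"]

# Toy annotation set for illustration only: metH is reported as present in
# SH4; atpA stands in for the rest of the annotated gene list.
toy_sh4_annotation = {"metH", "atpA"}

lost = missing_genes(methionine_salvage, toy_sh4_annotation)
print(lost)
```

Because the draft genome is incomplete, an absence found this way is only provisional until the same genes are also sought in the remaining metagenomic contigs, as described under "Genome binning."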

Photosynthetic system. "Ca. Synechococcus spongiarum" has 
been reported to contain phycocyanin and phycoerythrin (15). 
Here, we showed that "Ca. Synechococcus spongiarum" SH4 con- 
tains genes encoding all three types of antenna proteins (phyco- 
cyanin, phycoerythrin, and allophycocyanin) (Fig. 4), which sug- 
gests that this symbiont absorbs a wide spectrum of light for 
photosynthesis. The eight subunits of F-type ATPase were identi- 
fied. Compared to free-living cyanobacterial strains, however, 
genes in photosystem II (PSII), including psbP, psbI, psbK, psbM,
and psbY, were missing (Fig. 4). In further analysis, these genes 
could not be detected in the entire assemblage of metagenomic 
contigs or raw pyrosequencing reads. PsbP, together with PsbO, 
PsbQ, PsbU, and PsbV, forms the oxygen-evolving complex (OEC)
of PSII in cyanobacteria. This complex oxidizes water to provide 
protons for photosystem I (PSI). Synechocystis sp. strain PCC 6803 



March/April 2014 Volume 5 Issue 2 e00079-14 



FIG 4 Abundance of photosynthetic genes in "Ca. Synechococcus spongiarum" SH4 and related cyanobacterial strains based on KEGG orthology annotation.
The identities of the cyanobacterial strains are described in Table 1.
[Heat map: rows are KEGG orthologs grouped as photosystem II (PSII), photosystem I (PSI), cytochrome b6/f, F-type ATPase, and phycobilisome; columns are taxa 1 to 8; color scale shows gene copy number, 0 to 4.]



mutants with inactive PsbP exhibit reduced photoautotrophic 
growth (33). PsbI, PsbK, PsbM, and PsbY are low-molecular-weight
peptides involved in the assembly, stabilization, dimerization,
and photoprotection of the photosynthetic center of PSII
(34). The loss of PsbP and low-molecular-weight peptides implied
that the PSII complex of "Ca. Synechococcus spongiarum" SH4 is 
less stable than those of free-living strains and may represent a 
low-light-adapted photosynthetic system (35). Several genes en- 
coding low-molecular-weight peptides in the PSI complex and 
cytochrome b6/f were also not found in SH4 (Fig. 4). The abnormalities
observed in the photosynthetic system of "Ca. Synechococcus
spongiarum" SH4 might represent a protective mechanism
against damage caused by a high dosage of photosynthesis-derived 
oxygen and/or oxidative stress to the sponge host (36). 

Resistance to oxidative stress. Reactive oxygen species (ROS) 
are by-products of aerobic metabolism and can cause intracellular 
oxidative damage. ROS generated by the photosynthetic electron 
transport chain pose a significant threat to photosynthetic organ- 
isms, such as cyanobacteria (36). The ability to rapidly perceive 
ROS and initiate antioxidant defense is crucial for the survival of 
these organisms. In cyanobacterial strains, antioxidant enzymes 
play important roles in resistance to oxidative stress (36). How- 
ever, "Ca. Synechococcus spongiarum" SH4, as a photosynthetic 
microbe, has lost several antioxidant enzymes, including superoxide
dismutase (SOD), glutathione peroxidase (GPX), and DNA-binding
protein (Dps) (Fig. 5; see Table S1 in the supplemental
material). P. marinus strain CCMP1375, a low-light-adapted picocyanobacterium
with a nearly minimal oxyphototrophic genome
(35), also lacks SOD and Dps (Fig. 5) and escapes oxidative
damage by living at the bottom of the illuminated layer (35).
The observed loss of antioxidant enzymes is consistent with
SH4 being a low-light-adapted organism. However, unlike
P. marinus CCMP1375 (35), "Ca. Synechococcus spongiarum" SH4 lives in the mild
intercellular environment of the sponge host (16), where the
sponge body acts as a barrier that shields the cyanobacterial
symbiont from excessive sunlight, and it thus avoids the oxidative
damage caused by a highly efficient photosynthesis process.

FIG 5 Genes of "Ca. Synechococcus spongiarum" SH4 and related cyanobacterial strains important for resistance to oxidative stress, antibiotics, and
environmental toxins based on SEED/Subsystems annotation. See details in Tables S1 and S2. The identities of the cyanobacterial strains are described in Table 1.

Resistance to antibiotics and toxic compounds. In addition to 
the loss of genes encoding antioxidant enzymes, there was also a 
dramatic reduction of genes involved in resistance to antibiotics 
and environmental toxins in "Ca. Synechococcus spongiarum" 
SH4 (Fig. 5; see Table S2 in the supplemental material). There was 
a depletion of genes encoding proteins involved in arsenic resis- 
tance, multidrug resistance efflux pumps, integrase, beta- 
lactamase, and the negative regulator of beta-lactamase. Genes 
encoding cobalt-zinc-cadmium and methicillin resistance were 
also dramatically reduced in "Ca. Synechococcus spongiarum" 
SH4 compared to their occurrence in other free-living cyanobac- 
terial strains. Interestingly, a large fraction of these genes were also 
lost in the genome of Prochlorococcus marinus CCMP1375. Cya- 
nobacteria are a large and highly diverse group of photosynthetic 
prokaryotes which can adapt to various habitats, including those 
containing natural and artificial antibiotics and heavy metals (37). 
Resistance to these toxins in open water is important for the sur- 
vival of these organisms (38). However, "Ca. Synechococcus 
spongiarum," which inhabits the mild intercellular environment 
of the sponge host (16), can evade these toxins via the barriers of 
the sponge host. Accordingly, genes involved in resistance to en- 
vironmental factors are not required and were lost during the 
evolutionary development of this symbiont. 

Cell wall and capsule composition. Similar to the case for 
closely related free-living cyanobacterial strains, most of the genes 
responsible for the biosynthesis of peptidoglycan and lipopolysac- 
charide (LPS) could be found in SH4 (Table 1). The presence of 
these genes allows the formation of a rigid cell wall and is consis- 
tent with the characteristic spiral thylakoid membrane of "Ca. 
Synechococcus spongiarum" observed by transmission electron 
microscopy (16). Interestingly, genes responsible for the biosyn- 
thesis of capsular polysaccharide (CPS) and extracellular polysac- 
charides (EPS) were almost completely lost (Table 1). CPS and 
EPS are extracellular products of a wide range of microorganisms, 
including cyanobacteria (39), and play important roles, such as 
protection against environmental stresses (40), biofilm formation 
(41), and survival against phagocytosis or antibiotics (42, 43). The 
absence of CPS and EPS further indicated that SH4 has a low 
resistance to environmental stresses. However, this characteristic 
is likely to diminish the barrier between the symbiont and sponge 
cells, thus benefiting sponge-symbiont interactions and nutrient 
exchange. 

DNA replication and repair. Just like other reported bacterial 
symbionts (28), "Ca. Synechococcus spongiarum" SH4 retained 
the same set of genes for DNA replication as are found in closely 
related free-living cyanobacterial strains, but genes for DNA re- 
pair capabilities are limited (Table 1). Although the base excision 
repair and nucleotide excision repair pathways were found in SH4, the exonuclease
Exo VII complex in the mismatch repair pathway, the
exonuclease V complex (RecBCD) in the homologous recombi- 
nation pathway, and ATP-dependent DNA ligase were not de- 
tected (Table 1; see Table S3 in the supplemental material). There 



are also reports of a reduction of DNA repair genes in the symbi- 
otic cyanobacterium UCYN-A (44) and in the cyanobacterium- 
originating inclusions termed chromatophores (28). The absence 
of DNA repair genes in "Ca. Synechococcus spongiarum" SH4, 
similar to many cases of bacterial symbionts with extraordinarily 
small genomes (45), likely facilitates the evolution of the genome 
to adapt to the symbiotic partnership. 

Overview of the possible lifestyle of "Ca. Synechococcus 
spongiarum" SH4. Based on the analysis of the extracted draft 
genome and its comparison with those of free-living picocyano- 
bacteria, we proposed schematic functional features and adaptive 
schemes of "Ca. Synechococcus spongiarum" SH4 to the sponge- 
symbiont partnership (Fig. 6). Proteins with eukaryoticlike do- 
mains were identified in SH4, which is consistent with the role of 
"Ca. Synechococcus spongiarum" as a sponge symbiont. Genes 
involved in the biosynthesis of methionine and spermidine were 
lost, suggesting that "Ca. Synechococcus spongiarum" depends 
on the sponge host and/or the other sponge symbionts for essen- 
tial nutrients and chemical factors. The presence of a functional 
photosynthesis pathway in this symbiont guarantees a steady carbon
supply to the host and ensures its ecological success (5). However,
the photosynthetic system in SH4 might be unstable and have
low efficiency owing to the loss of PsbP from the OEC
and the loss of several low-molecular-weight peptides. Furthermore,
SH4 should have a low resistance to oxidative stress because
of the loss of several antioxidant enzymes. These features indicate 
that "Ca. Synechococcus spongiarum," similar to Prochlorococcus 
marinus SS120, should be a low-light-adapted picocyanobacte- 
rium but uses the alternative strategy of symbiosis for adaptation 
to low light (35). "Ca. Synechococcus spongiarum" also had a low 
resistance to environmental antibiotics and toxins, which may be 
further compromised by the defect in the biosynthesis of CPS and 
EPS. These features force the symbiont to inhabit the mild inter- 
cellular environment of the host. However, the defect in the bio- 
synthesis of CPS and EPS may represent a mechanism used to 
diminish the barrier between symbiont and sponge cells to benefit 
sponge-symbiont interactions and nutrient exchange. In addition, 
the loss of DNA repair genes may play roles in facilitating the 
genome evolution of "Ca. Synechococcus spongiarum" SH4 to 
adapt to the sponge-symbiont partnership. 

Summary. Picocyanobacteria in the genera Prochlorococcus 
and Synechococcus numerically dominate the picophytoplankton 
communities of the world's ocean (19, 20). During their adapta- 
tion to open ocean and coastal environments, these organisms 
have overcome various environmental stresses (35, 46). In con- 
trast to free-living picocyanobacteria, "Ca. Synechococcus spon- 
giarum" is a sponge symbiont. The exclusive detection of "Ca. 
Synechococcus spongiarum" in sponges (18), their vertical trans- 
mission between generations (17), and their large phylogenomic 
dissimilarity to free-living picocyanobacteria (Fig. 2) suggest an 
intimate symbiotic relationship between "Ca. Synechococcus 
spongiarum" and the sponge host. The draft genome of "Ca. Syn- 
echococcus spongiarum" SH4 provided further insight into the 
adaptive mechanism of this intercellular symbiont to live in the 
sponge host (Fig. 6). 

Although the draft genome is estimated to have a recovery of 
90% and is not effectively completed, the absence of certain genes 
has been confirmed through searching against the entire assem- 
blage of metagenomic contigs. However, the incomplete genome 
precludes the detection of several other potentially symbiotic fea- 
FIG 6 Schematic of mode of life of the sponge symbiont "Ca. Synechococcus spongiarum." The schematic figure was deduced from the genomic analysis of the
draft genome of strain SH4. 



tures, such as transposase-driven genome rearrangement and hor- 
izontal gene transfer (45). The recovery of an effectively complete 
genome by combining a metagenome and a single-cell-derived 
genome, perhaps even using the bacterial artificial chromosome 
(BAC) library method, should further elucidate this symbiotic
partnership. According to the observed intimate interdependence 
of the symbiont with the sponge and the phylogenomic dissimi- 
larity with free-living picocyanobacteria, our study suggests that 
the symbiotic partnership between "Ca. Synechococcus spongia- 
rum" and the sponge has been established for a long time. How- 
ever, this conclusion is inconsistent with the widespread distribu- 
tion but low genetic differentiation of "Ca. Synechococcus 
spongiarum" (47). Although cryptic diversity of "Ca. Synechoc- 
occus spongiarum" among sponges has been suggested based on 
the variation in ITS regions, which supports the potential cryptic 
genetic differentiation (18), additional evidence is required to 
confirm the intraspecies diversity and divergence of this sponge 
symbiont. Due to the high abundance of "Ca. Synechococcus 
spongiarum" in the sponge C. foliascens, binning genomes of these 
symbionts from multiple individuals of this sponge species 
located at a single site and/or at different geographical sites will improve 
our knowledge about their genetic diversity and differentiation. 

MATERIALS AND METHODS 

Sample collection and DNA extraction. Sponge tissues collected from 
site RB4 (22°44'56"N, 38°59'35"E), located in the Rabigh Bay of Saudi 
Arabia along the Red Sea coast, in April 2012, were placed into separate 
sterile plastic bags and immediately transported back to the laboratory. 
Sponge tissues were flushed using 0.22-μm-membrane-filtered seawater 
to remove loosely attached microbes and debris. Ten-milliliter amounts 
of the flushed sponge tissues were preserved in 70% ethanol for identifi- 
cation of the sponge species. The tissues (0.5 ml) were dissected, cut into 
small pieces with a sterile razor blade, and frozen in 0.8 ml of extraction 
buffer (100 mM Tris-HCl, 100 mM EDTA, 100 mM Na2HPO4, 1.5 M 
NaCl, 1% cetyltrimethylammonium bromide [CTAB], pH 8.0) for DNA 
extraction. Total genomic DNA was extracted using the modified sodium 
dodecyl sulfate-based method described previously by Lee et al. (6) and 
purified using the Mo Bio soil DNA isolation kit (Mo Bio Laboratories, 
Carlsbad, CA, USA) according to the manufacturer's manual. The 
products were checked for quality and quantified using a NanoDrop ND-100 device 
(Thermo Fisher, USA) and stored at −20°C until use. 

Metagenome sequencing and assembly. Metagenomic DNA was se- 
quenced on a 454 FLX platform using Titanium chemistry. This produced 
a total of 315,119 reads with a total length of 160.2 Mbp. Raw reads were 
subjected to quality filtering by using the 454QC.pl script in the NGS QC 
(next-generation sequencing quality control) Toolkit with default 
parameters (48), and reads shorter than 100 bp and reads containing more than 5 
ambiguous nucleotides were removed. Qualified reads with a total length 
of 157.5 Mbp were assembled using GS De Novo Assembler (Newbler) 
with a threshold of 100-bp overlap and 98% identity, which produced 
7,203 contigs longer than 100 bp with a total length of 9.8 Mbp. A total of 
1,114 contigs longer than 2,000 bp with a total length of 4.2 Mbp were 
used for genome binning. 
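
The read and contig thresholds stated above can be illustrated with a minimal Python sketch. It is not the 454QC.pl script or the Newbler assembler; the function names (passes_read_filter, select_contigs, summarize) are hypothetical, and counting every non-ACGT character as an ambiguous nucleotide is an assumption. 

```python
def passes_read_filter(seq, min_len=100, max_ambiguous=5):
    """Keep a read if it has at least min_len bases and at most max_ambiguous ambiguous bases.

    Assumption: any character other than A, C, G, or T counts as ambiguous.
    """
    ambiguous = sum(1 for base in seq.upper() if base not in "ACGT")
    return len(seq) >= min_len and ambiguous <= max_ambiguous


def select_contigs(contigs, min_len=2000):
    """Keep contigs longer than min_len bp; contigs maps contig name -> sequence."""
    return {name: seq for name, seq in contigs.items() if len(seq) > min_len}


def summarize(seqs):
    """Return (number of sequences, total length in bp)."""
    seqs = list(seqs)
    return len(seqs), sum(len(s) for s in seqs)
```

Applied to the assembled contigs, select_contigs followed by summarize would report the kind of count and total length given above for the contigs passed to genome binning. 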
Genome binning. Draft genome binning was carried out mainly based 
on genome coverage and tetranucleotide frequency patterns according to 
a previously described method (21), but several modifications were made. 
The 454 pyrosequencing reads were mapped to the assembled contigs 
using Bowtie2 (49), and the genome coverage was calculated with SAM- 
tools (50) and Perl scripts. Another sample that contained highly abun- 
dant sponge cells was subjected to DNA extraction and Illumina sequenc- 
ing. The Illumina reads were mapped to the assembled metagenomic 
contigs, and a secondary genome coverage was obtained. The 
tetranucleotide frequency of the assembled contigs was calculated using Perl scripts 
written by Albertsen et al. (21). Principal component analysis of the tet- 
ranucleotide frequency was performed using the Vegan package 2.0-5. 
The open reading frames (ORFs) of the assembled contigs were predicted 
using Prodigal (51). A set of 107 hidden Markov models (HMMs) of 
essential proteins (21) was searched against the predicted ORFs with 
default cutoff values in the HMM datasets. The essential proteins identified 
were searched against the NCBI NR database with BLASTP (E value of 
1e-05) and taxonomically assigned using MEGAN 4.0 (52). Contigs were 
labeled according to the phylum-level taxonomic affiliation of the essen- 
tial proteins. 
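
As an illustration of the tetranucleotide frequency signal used for binning, the following Python sketch counts overlapping 4-mers in a contig and normalizes them to relative frequencies. It does not reproduce the Perl scripts of Albertsen et al. (21); the function name tetranucleotide_frequency is hypothetical, and skipping windows that contain non-ACGT characters, as well as not merging reverse complements, are assumptions. 

```python
from itertools import product

# All 256 possible tetranucleotides in a fixed order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequency(seq):
    """Return the relative frequencies of the 256 tetranucleotides in seq.

    Overlapping windows are counted; windows containing characters other
    than A, C, G, or T are skipped (an assumption of this sketch).
    """
    seq = seq.upper()
    counts = dict.fromkeys(TETRAMERS, 0)
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:
            counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [counts[k] / total for k in TETRAMERS]
```

Each contig then yields one 256-element vector, and a matrix of such vectors is the kind of input on which a principal component analysis can be run. 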

Using the previously described R pipelines (21), the genome of "Ca. 
Synechococcus spongiarum" SH4 was extracted from the assembled con- 
tigs (see Fig. S3 in the supplemental material). A group of contigs with 454 
metagenomic coverage of greater than 20× showed high GC content and 
included most of the labeled cyanobacterial contigs. From these contigs, a 
core set of cyanobacterial contigs was extracted (Fig. S3, dataset 1). 
Because several contigs that were assigned to the Cyanobacteria fell into a 
region with lower coverage than the core set, binning was carried out on 
a larger region (Fig. S3, dataset 2), and another set of contigs was obtained 
(Fig. S3, dataset 3). The ORFs in each contig were taxonomically assigned 
using the BLASTP and MEGAN programs according to the method de- 
scribed above. If more than 50% of the ORFs of a contig in dataset 2 were 
assigned to the Cyanobacteria, that contig was also included in the SH4 
genome. Finally, a draft genome of "Ca. Synechococcus spongiarum" 
SH4, composed of 273 contigs, was obtained (Table 1). 
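
The inclusion rule for lower-coverage contigs (more than 50% of ORFs assigned to the Cyanobacteria) can be sketched in Python as below. The function name is_cyanobacterial_contig and the input format are hypothetical, and counting unassigned ORFs in the denominator is an assumption, because the text does not state how they were handled. 

```python
def is_cyanobacterial_contig(orf_phyla, target="Cyanobacteria", threshold=0.5):
    """Return True if more than `threshold` of a contig's ORFs are assigned to `target`.

    orf_phyla: one phylum label per predicted ORF (None for an unassigned ORF).
    Assumption: unassigned ORFs count toward the total number of ORFs.
    """
    if not orf_phyla:
        return False
    hits = sum(1 for phylum in orf_phyla if phylum == target)
    return hits / len(orf_phyla) > threshold
```

Under this rule, a contig with exactly half of its ORFs assigned to the Cyanobacteria would not be included, because the stated criterion is "more than 50%." 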

Genome analysis. For the KEGG annotation, predicted amino acid 
sequences of the SH4 draft genome and other reference genomes down- 
loaded from the NCBI Genome database were searched against the KEGG 
database (53) by using BLASTP with a maximum E value cutoff of 1e-05. 
Amino acid sequences were also searched against the GenBank NR data- 
base, and the output xml file was imported into MEGAN for taxonomic 
affiliation and SEED/Subsystems annotation (54). KEGG and SEED/Sub- 
systems annotations of the SH4 and closely related cyanobacterial rela- 
tives were compared to evaluate the genome reduction of the sponge 
symbiont "Ca. Synechococcus spongiarum." Genes of interest that could 
not be found in the SH4 draft genome were rechecked in the remaining 
assembled metagenomic contigs with lengths longer than 500 bp and 454 
metagenomic coverage greater than 8×. Low-molecular-weight peptides 
in photosynthetic systems were also searched through the entire assem- 
blage of metagenomic contigs and raw pyrosequencing reads. If the over- 
looked gene was detected and assigned to the Cyanobacteria, it was con- 
sidered to be affiliated with "Ca. Synechococcus spongiarum" SH4. For 
the phylogenomic analysis, 31 phylogenetic marker proteins 
were predicted from the SH4 draft genome and cyanobacterial genomes in 
the JGI database using AMPHORA (22). The sequences of each marker 
gene were aligned individually using ClustalW (55). The aligned 
sequences were concatenated, and a maximum-likelihood phylogenetic tree 
was constructed using PhyML (56) according to a previously described 
method (22). Bootstrap values were calculated with 100 replications. Av- 
erage nucleotide identity values (ANIs) and z scores of the tetranucleotide 
frequency between the extracted genome of SH4 and closely related cya- 
nobacterial genomes were calculated using JSpecies (23). Eukaryoticlike 
domains, including ARs, TPRs, LRRs, NHL repeats, fibronectin type III, 
and cadherins, were annotated by using the pfam_scan.pl script to search 
against the PFAM database (57) according to a previously described 
method (58). 
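
The concatenation of individually aligned marker sequences in the phylogenomic analysis can be illustrated with a short Python sketch. It is not the AMPHORA or PhyML workflow itself; the function name concatenate_alignments is hypothetical, and padding a genome that lacks a marker with gap characters is an assumption about how missing markers might be handled. 

```python
def concatenate_alignments(alignments, gap="-"):
    """Concatenate per-marker alignments into one sequence per genome.

    alignments: list of dicts mapping genome name -> aligned sequence;
    all sequences within one dict must have the same length.
    Assumption: a genome missing a marker is padded with gap characters.
    """
    genomes = sorted({name for aln in alignments for name in aln})
    parts = {name: [] for name in genomes}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in one alignment differ in length")
        width = lengths.pop()
        for name in genomes:
            parts[name].append(aln.get(name, gap * width))
    return {name: "".join(chunks) for name, chunks in parts.items()}
```

The resulting dictionary holds one concatenated sequence of equal length per genome, which is the form of input expected by tree-building programs. 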

16S rRNA prediction and phylogenetic tree construction. The 16S 
rRNA genes in the qualified pyrosequencing reads and the metagenomic 
assembled contigs were predicted using Meta-RNA (59). Predicted 16S 
rRNA gene fragments that were longer than 100 bp were loaded into the 
online RDP classifier for taxonomic assignment. For phylogenetic analy- 
sis, a full-length cyanobacterial 16S rRNA gene predicted from the assem- 
bled contigs was searched against the NCBI GenBank database using 
BLASTN to detect closely related relatives. A neighbor-joining tree was 
constructed using MEGA5.1 software (60). Multiple alignment was per- 
formed using ClustalW (55). Distance matrices were calculated using 
Kimura's two-parameter correction model (61). Bootstrap values were 
determined with 1,000 replications. 
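
For reference, Kimura's two-parameter distance between two aligned sequences is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions, respectively. The Python sketch below computes it for one sequence pair. It is not the MEGA implementation; the function name kimura_2p_distance is hypothetical, and ignoring sites with gaps or non-ACGT characters is an assumption. 

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def kimura_2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or a non-ACGT character are
    ignored (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; all other changes are transversions.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For highly divergent pairs the argument of the logarithm can become nonpositive, in which case math.log raises a ValueError and the distance is undefined under this model. 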

Nucleotide sequence accession numbers. The 16S rRNA gene 
sequence of "Ca. Synechococcus spongiarum" SH4 was deposited in the 
NCBI GenBank database under accession number KJ174471. All the 
metagenomic sequences are available in the NCBI Sequence Read Archive 
(SRA) database under accession number SRP035516. The draft genome of 
"Ca. Synechococcus spongiarum" SH4 has been deposited at GenBank 
under accession number JENA00000000. The version described in this 
paper is version JENA01000000. 